A vector-µSIMD-VLIW architecture for multimedia applications


Abstract

Media processing has motivated strong changes in the focus and design of processors. These applications are composed of heterogeneous regions of code, some with high levels of data-level parallelism (DLP) and others with only modest amounts of instruction-level parallelism (ILP). A common approach to these applications is the µSIMD-VLIW processor. However, the ILP regions fail to scale as the width of the machine increases, even though a wider machine is desirable for high performance in the DLP regions. In this paper, we propose and evaluate adding vector capabilities to a µSIMD-VLIW core to speed up the execution of the DLP regions while, at the same time, reducing the fetch bandwidth requirements. Results show that, in the DLP regions, 2-issue and 4-issue vector-µSIMD-VLIW architectures outperform an 8-issue µSIMD-VLIW by factors of up to 2.7X and 4.2X (1.6X and 2.1X on average), respectively. As a result, the DLP regions account for less than 10% of the total execution time, and performance is dominated by the ILP regions.
